Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

--get "{query:$NAME}" only returns the first match #302

Closed
yahesh opened this issue May 14, 2024 · 15 comments
Closed

--get "{query:$NAME}" only returns the first match #302

yahesh opened this issue May 14, 2024 · 15 comments

Comments

@yahesh
Copy link

yahesh commented May 14, 2024

I learned about trurl today and gave it go. You can get the value of a query parameter through --get "{query:$NAME}". It's also possible to add multiple query parameters with the same name through --append "query=$NAME=$VALUE1" --append "query=$NAME=$VALUE2". However, --get "{query:$NAME}" will only retrieve the first value.

This might lead to lost information. Some implementations (like PHP) support multiple values for a query parameter (like $NAME[]=$VALUE1&$NAME[]=$VALUE2). Some implementations (like PHP if the query name does not end in []) use the last occurrence of a query parameter instead of the first.

It would be great to have a way to get all values of a specific query parameter name returned.

@bagder
Copy link
Member

bagder commented May 14, 2024

Agreed. Should it require a different option? How should it provide them in the output?

@yahesh
Copy link
Author

yahesh commented May 14, 2024

Given that parsing the output of --get "{query:$NAME} is risky already due to the URL-decoding, one could print the values of the different values in a tab-separated list so that handling a list of URLs is still supported. The support for outputting all query parameter values should probably be put behind a command option like --all-values.

So this could look like this:

% trurl "https://localhost?name=value1&name=value2" --get "{query:name}" --all-values
value1	value2

% trurl "https://localhost?name=value1&name=value2" "https://localhost?name=value3&name=value4" --get "{query:name}" --all-values
value1	value2
value3	value4

There are risks attached to this if the value contains characters like line breaks and tabs due them getting URL-decoded by default:

% trurl "https://localhost?name=value1&name=value2%0Avalue3%09value4" --get "{query:name}" --all-values
value1	value2
value3	value4

This can be circumvented by keeping the URL-encoding in place:

% trurl "https://localhost?name=value1&name=value2%0Avalue3%09value4" --get "{:query:name}" --all-values
value1	value2%0Avalue3%09value4

@bagder
Copy link
Member

bagder commented May 14, 2024

Given that parsing the output of --get "{query:$NAME} is risky already due to the URL-decoding

  1. Why is that risky? It is only risky if you do funny assumptions about what you get.
  2. It already supports getting the data URL encoded with --get {:query}.

@bagder
Copy link
Member

bagder commented May 14, 2024

But yeah, when the output can be "anything", making separators becomes difficult.

@bagder
Copy link
Member

bagder commented May 14, 2024

I suppose one way could be to allow trurl to mention how many instances there are, as then a user could cherry-pick the exact instance. For example: (brain-storming mode engaged)

$ trurl "example.com?a=firsrt&a=second&a=third" --get {query:a:amount}`
3
$ trurl "example.com?a=firsrt&a=second&a=third" --get {query:a:2}`
second

@yahesh
Copy link
Author

yahesh commented May 14, 2024

From a shell-user perspective I'd personally prefer a GREPable output format. The counting and index features would require loops while something like a tab-separated list does not. In the case of a GREPable output format one could just string the trurl call together with an xargs call, etc.

@bagder
Copy link
Member

bagder commented May 14, 2024

How would a tab in the content be separated from the tab as a separator?

@bagder
Copy link
Member

bagder commented May 14, 2024

Ah, we do have:

trurl "https://localhost?name1=value&name1=again" --get '{query-all:name1}'

@bagder
Copy link
Member

bagder commented May 14, 2024

(Which has the separator problem but with space..)

@yahesh
Copy link
Author

yahesh commented May 14, 2024

How would a tab in the content be separated from the tab as a separator?

It wouldn't in a secure way just like it currently isn't securely possible in this case:

% trurl "localhost?name=value1" "localhost?name=value2%0Avalue3" --get "{query:name}"
value1
value2
value3

But with keeping the URL-encoding in place, this is not an issue:

% trurl "localhost?name=value1" "localhost?name=value2%0Avalue3" --get "{:query:name}"
value1
value2%0Avalue3

Ah, we do have:

trurl "https://localhost?name1=value&name1=again" --get '{query-all:name1}'

Ah, I must have overlooked that in the manpage.

(Which has the separator problem but with space..)

Which is equivalent to the thing mentioned above. Keeping the URL-encoding in place during the parsing stage solves the separator problem:

% trurl "localhost?name=value1" "localhost?name=value2%0Avalue3&name=value4" --get "{:query-all:name}"
value1
value2%0Avalue3 value4

@bagder
Copy link
Member

bagder commented May 14, 2024

Maybe we should offer the user a way to specify the separator as a custom string, as then a user can increase the chances of it working...

@yahesh
Copy link
Author

yahesh commented May 14, 2024

As {query-all:$NAME} exists but is a bit hard to find, it would be fine for me if query-all would be mentioned in the output of --help.

@yahesh
Copy link
Author

yahesh commented May 14, 2024

Maybe we should offer the user a way to specify the separator as a custom string, as then a user can increase the chances of it working...

I personally don't think that will help a lot with handling query parameters in a secure way. Users already have the option to replace the separator with something of their liking via tr. Inline-separators will never be 100% secure when users don't take special characters from the URL-decoding process into account.

A safe way for users to handle separators e.g. looks like this:

% trurl "localhost?name=value1%20value2&name=value3" --get "{:query-all:name}" | xargs -n1
value1%20value2
value3

That's even the case with --accept-space enabled:

% trurl "localhost?name=value1 value2&name=value3" --accept-space --get "{:query-all:name}" | xargs -n1
value1+value2
value3

@bagder
Copy link
Member

bagder commented May 14, 2024

I personally don't think that will help a lot with handling query parameters in a secure way

If you set the separator to be for example 16 random letters, the risk they will be in the content is next to zero. But of course then the user needs to split the content into separate parts using the same boundary. Which admittedly is inconvenient.

It could be made to work like this:

$ trurl --boundary OOO example.com/?name=a&name=b&name=c --get "{query-all:name}"
OOOaOOObOOOc

Keeping the content URL encoded is another way and one that already is there, but then the user still needs to decode the data in a second inconvenient step.

@bagder
Copy link
Member

bagder commented May 17, 2024

I think that since this feature is actually already supported we can just call this solved. We can of course most certainly still improve the option to become more "intuitive" and easy to find in -h output and manpage etc.

@bagder bagder closed this as completed May 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants